Evaluating bias in retrieval systems for recall oriented documents retrieval

Authors

  • Sanam Noor
  • Shariq Bashir
Abstract

The evaluation of retrieval systems has long been a focus of research. Most retrieval systems seem to work better for precision-oriented documents than for recall-oriented documents, because the two kinds of documents differ. A system that retrieves precision-oriented documents well is therefore not necessarily good for recall-oriented documents, and evaluation is needed to determine whether a given method is suitable for recall-oriented document retrieval. We evaluate different retrieval systems for recall-oriented document retrieval, with the main focus on finding bias in these systems. Four of the systems evaluated use query expansion techniques, while the other three retrieve documents without query expansion. Patent documents are used to analyze the effectiveness of the retrieval systems. The accessibility of documents is measured with the retrievability measure, and the Lorenz curve and Gini coefficient are used to measure bias in the systems. Our experimental results show that Term Frequency Inverse Document Frequency (TFIDF) is less biased, while the exact method shows high retrievability inequality. Among the query expansion techniques, language modelling shows less inequality.
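The abstract names the retrievability measure, the Lorenz curve and the Gini coefficient but does not give their formulas. The following is a minimal sketch of how such a bias measurement could look, assuming a simple cutoff-based cumulative retrievability count and the standard discrete Gini formula; the function names, the cutoff parameter and the toy rankings are hypothetical and are not taken from the paper.

```python
from collections import Counter


def retrievability(ranked_lists, doc_ids, cutoff):
    """Count, per document, how many queries return it in the top `cutoff` ranks.

    Assumption: a cumulative (0/1 per query) retrievability count.
    """
    counts = Counter()
    for ranking in ranked_lists:
        # A document counts at most once per query.
        counts.update(set(ranking[:cutoff]))
    return [counts[d] for d in doc_ids]


def lorenz_curve(scores):
    """Cumulative share of total retrievability, documents sorted ascending."""
    ordered = sorted(scores)
    total = sum(ordered)
    if total == 0:
        raise ValueError("no document was retrieved by any query")
    points, running = [0.0], 0
    for s in ordered:
        running += s
        points.append(running / total)
    return points


def gini(scores):
    """Discrete Gini coefficient: 0 means equal access, values near 1 mean
    a few documents receive almost all of the retrievability."""
    ordered = sorted(scores)
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total == 0:
        raise ValueError("Gini is undefined for empty or all-zero scores")
    weighted = sum((2 * i - n - 1) * s for i, s in enumerate(ordered, start=1))
    return weighted / (n * total)


# Toy example: three queries over four documents, top-2 cutoff.
doc_ids = ["d1", "d2", "d3", "d4"]
ranked_lists = [["d1", "d2", "d3"], ["d1", "d3", "d2"], ["d1", "d4", "d2"]]
scores = retrievability(ranked_lists, doc_ids, cutoff=2)
print(scores, lorenz_curve(scores), gini(scores))
```

Under these assumptions, a retrieval system whose Lorenz curve lies closer to the diagonal (lower Gini) gives documents more equal access, which is the sense in which one system is "less biased" than another.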

Similar resources

Review of ranked-based and unranked-based metrics for determining the effectiveness of search engines

Purpose: Traditionally, there have been many metrics for evaluating search engines; nevertheless, various researchers have proposed new metrics in recent years. Awareness of these new metrics is essential for conducting research on search engine evaluation. The purpose of this study was therefore to provide an analysis of important and new metrics for evaluating search engines. Methodology: This is ...

On the relationship between query characteristics and IR functions retrieval bias

Bias quantification of retrieval functions with the help of document retrievability scores has recently evolved as an important evaluation measure for recall-oriented retrieval applications. While numerous studies have evaluated the retrieval bias of retrieval functions, solid validation of its impact on realistic types of queries is still limited. This is due to the lack of well-accepted criteria f...

Analyzing Document Retrievability in Patent Retrieval Settings

Most information retrieval settings, such as web search, are typically precision-oriented, i.e. they focus on retrieving a small number of highly relevant documents. However, in specific domains, such as patent retrieval or law, recall becomes more relevant than precision: in these cases the goal is to find all relevant documents, requiring algorithms to be tuned more towards recall at the cost...

Improving Retrievability and Recall by Automatic Corpus Partitioning

With increasing volumes of data, much effort has been devoted to finding the most suitable answer to an information need. However, in many domains, the question whether any specific information item can be found at all via a reasonable set of queries is essential. This concept of Retrievability of information has evolved into an important evaluation measure of IR systems in recall-oriented appl...

Retrieval Models versus Retrievability

Retrievability is an important measure in information retrieval that can be used to analyze retrieval models and document collections. Rather than just focusing on a set of few documents that are given in the form of relevance judgments, retrievability examines what is retrieved, how frequently it is retrieved, and how much effort is needed to retrieve it. Such a measure is of particular intere...

Journal:
  • Int. Arab J. Inf. Technol.

Volume 12  Issue 

Pages  -

Publication date 2015